Back to the Future: an Even More Nearly Optimal Cardinality Estimation Algorithm

Author

  • Kevin J. Lang
Abstract

We describe a new cardinality estimation algorithm that is extremely space-efficient. It applies one of three novel estimators to the compressed state of the Flajolet-Martin-85 coupon collection process. In an apples-to-apples empirical comparison against compressed HyperLogLog sketches, the new algorithm simultaneously wins on all three dimensions of the time/space/accuracy tradeoff. Our prototype uses the zstd compression library, and produces sketches that are smaller than the entropy of HLL, so no possible implementation of compressed HLL can match its space efficiency. The paper’s technical contributions include analyses and simulations of the three new estimators, accurate values for the entropies of FM85 and HLL, and a non-trivial method for estimating a double asymptotic limit via simulation.


Similar resources

Estimation of groundwater level using a hybrid genetic algorithm-neural network

In this paper, we present an application of evolved neural networks using a real coded genetic algorithm for simulations of monthly groundwater levels in a coastal aquifer located in the Shabestar Plain, Iran. After initializing the model with groundwater elevations observed at a given time, the developed hybrid genetic algorithm-back propagation (GA-BP) should be able to reproduce groundwater ...



A Soft-Input Soft-Output Target Detection Algorithm for Passive Radar

Abstract: This paper proposes a novel scheme for multi-static passive radar processing, based on soft-input soft-output processing and Bayesian sparse estimation. In this scheme, each receiver estimates the probability of target presence based on its received signal and the prior information received from a central processor. The resulting posterior target probabilities are transmitted to the c...


Stock Portfolio-Optimization Model by Mean-Semi-Variance Approach Using of Firefly Algorithm and Imperialist Competitive Algorithm

Selecting approaches with appropriate accuracy and suitable speed for decision-making is one of the challenges managers face. Investment decisions are among managers' main decisions, and one investment approach is trading securities in financial markets. When some assets and barriers of the real world have been considered, optimization of...


Signal Prediction by Layered Feed-Forward Neural Network (RESEARCH NOTE)

In this paper a nonparametric neural network (NN) technique for prediction of future values of a signal based on its past history is presented. This approach bypasses modeling, identification, and parameter estimation phases that are required by conventional parametric techniques. A multi-layer feed forward NN is employed. It develops an internal model of the signal through a training operation...



Journal:
  • CoRR

Volume: abs/1708.06839

Publication date: 2017